Final Project: Data visualisation and analysis of social responsibilities of global warming in terms of CO2 emissions and average temperature
Indhold
Final Project: Data visualisation and analysis of social responsibilities of global warming in terms of CO2 emissions and average temperature¶
by Kathrine Schultz-Nielsen (s183929), David Ribberholt Ipsen (s164522), Hanlu He (s183909)¶
Course: 02806 - Social data analysis and visualization spring 2022
Course responsible: Sune Lehmann Jørgensen
DTU - Technical University of Denmark¶
1. Introduction and Motivation¶
Global warming is no longer a problem we can look the other way and threat on the earth’s ecosystem, subsequently our survival increases as we speak. CO2 emission is what most considers as the primary driver for climate change for its impact on increasing average temperature of the earth. The amount of atmospheric concentration of carbon dioxide has increased by 50% since the industrilisation era due to human activities [1]. With increasing living standards over the past decades were a result of us exploiting non-renewable enegry sources such as coal, oil and natural gas, lack of awareness of sustainable infrastructure design in urbanization processes and rapid population growth [2]. And we are already facing many of the consequences of global warming such as ice sheets melting, more frequent forest wildfires, rising sea level, more intense heat waves and appearence of more extreme temperatures [3]. These changes are esimtated to be irreversible even 1000 years after if we manage to stop CO2 emissions [4]. We have slowly came to realise the seriousenss of the issue and began drawing out plans to slow down the rising temperature. The paris agreement in 2015 is the largest agreement that binded 196 countries in the world to take actions against global warming, with the ambitious goal of limiting temperature increase to 1.5 degree celcius [5]. In this project we will be presenting the global warming trend with CO2 emissions and global recorded temperature data, assessing contribution of various social factors to CO2 emissions, dig deeper into nation-wise responsibilities for CO2 emissions and lastly looking into the future of temperature rise based on current climate change mitigation policies.
We will be working primarily with two datasets for this project, the Our World in Data CO2 and Greenhouse Gas Emissions database, Climate watch emissions by sector database and the Climate Change: Earth Surface Temperature Data. The reason for choosing the two datasets is because to assess the impact of global warming interms of CO2 emissions and rising temperature requires extensive information in the geographical and time domain, as the matter concerns all of us no matter where we are from and comparison of the past to the present so we can look into the future. And the two datasets do exactly that, with:
The Our Wolrd in Data CO2 and Greenhouse Gas Emission database contains 21591 records of CO2 emissions and various relevant variables such as energy consumption, gdp, population, CO2 emissions by industrial sector etc. for 243 countries and regions with the earliest record from 1750 and newest from 2020.
The Climate watch emissiosn by sector database contains data of CO2 emissions in million tonnes by sectors: Transporation, Building, Industrial Processes for each country from 2018
The Climate Change: Earth Surface Temperature Data contains a collection data files with 8 million+ records of average temperature by cities and countries around the world.
Therefore, we believe the two datasets compliments well with each other to provide us with sufficient information for our visualisations
To tell our story, we aim to find a balance between author driven and user driven narrative with interactive slide shows and annotated graphs. In this way, we are able to present the facts as well as allowing the users to make up their own opinion on the what, why and how of global warming, as considerable number of factors and nations are involved.
2. Dataset segmentation and cleaning¶
Before we delve into the deeper analysis of this project, let us import all the necessary libraries for this project:
# Basic libraries
import numpy as np
import pandas as pd
import json
import matplotlib.pyplot as plt
import seaborn as sns
import math
import time
import itertools
import warnings
warnings.filterwarnings('ignore')
# helper
import pycountry_convert as pc
def country_to_continent(country_code):
try:
country_alpha2 = pc.country_alpha3_to_country_alpha2(country_code)
country_continent_code = pc.country_alpha2_to_continent_code(country_alpha2)
country_continent_name = pc.convert_continent_code_to_continent_name(country_continent_code)
except:
country_continent_name = country_code
return country_continent_name
# Display libraries
from IPython.display import IFrame, display
from IPython.core.display import display as disp, HTML
from ipywidgets import widgets, interact, interactive, fixed, interact_manual
from plotly import __version__
import plotly.express as px
import plotly.offline as pyo
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots
pio.renderers.default = 'notebook'
Now, we will proceed to filtering, subsetting and cleaning the original datasets
Filtering & subsetting the full dataset¶
df_co2 = pd.read_csv('data/owid-co2-data.csv',sep = ',',encoding = 'unicode_escape')
df_co2
| iso_code | country | year | co2 | co2_per_capita | trade_co2 | cement_co2 | cement_co2_per_capita | coal_co2 | coal_co2_per_capita | ... | ghg_excluding_lucf_per_capita | methane | methane_per_capita | nitrous_oxide | nitrous_oxide_per_capita | population | gdp | primary_energy_consumption | energy_per_capita | energy_per_gdp | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | 1949 | 0.015 | 0.002 | NaN | NaN | NaN | 0.015 | 0.002 | ... | NaN | NaN | NaN | NaN | NaN | 7624058.0 | NaN | NaN | NaN | NaN |
| 1 | AFG | Afghanistan | 1950 | 0.084 | 0.011 | NaN | NaN | NaN | 0.021 | 0.003 | ... | NaN | NaN | NaN | NaN | NaN | 7752117.0 | 9.421400e+09 | NaN | NaN | NaN |
| 2 | AFG | Afghanistan | 1951 | 0.092 | 0.012 | NaN | NaN | NaN | 0.026 | 0.003 | ... | NaN | NaN | NaN | NaN | NaN | 7840151.0 | 9.692280e+09 | NaN | NaN | NaN |
| 3 | AFG | Afghanistan | 1952 | 0.092 | 0.012 | NaN | NaN | NaN | 0.032 | 0.004 | ... | NaN | NaN | NaN | NaN | NaN | 7935996.0 | 1.001732e+10 | NaN | NaN | NaN |
| 4 | AFG | Afghanistan | 1953 | 0.106 | 0.013 | NaN | NaN | NaN | 0.038 | 0.005 | ... | NaN | NaN | NaN | NaN | NaN | 8039684.0 | 1.063052e+10 | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 25186 | ZWE | Zimbabwe | 2016 | 10.738 | 0.765 | 1.415 | 0.639 | 0.046 | 6.959 | 0.496 | ... | 2.076 | 11.50 | 0.820 | 6.21 | 0.443 | 14030338.0 | 2.096179e+10 | 47.5 | 3385.574 | 1.889 |
| 25187 | ZWE | Zimbabwe | 2017 | 9.582 | 0.673 | 1.666 | 0.678 | 0.048 | 5.665 | 0.398 | ... | 2.023 | 11.62 | 0.816 | 6.35 | 0.446 | 14236599.0 | 2.194784e+10 | NaN | NaN | NaN |
| 25188 | ZWE | Zimbabwe | 2018 | 11.854 | 0.821 | 1.308 | 0.697 | 0.048 | 7.101 | 0.492 | ... | 2.173 | 11.96 | 0.828 | 6.59 | 0.456 | 14438812.0 | 2.271535e+10 | NaN | NaN | NaN |
| 25189 | ZWE | Zimbabwe | 2019 | 10.949 | 0.748 | 1.473 | 0.697 | 0.048 | 6.020 | 0.411 | ... | NaN | NaN | NaN | NaN | NaN | 14645473.0 | NaN | NaN | NaN | NaN |
| 25190 | ZWE | Zimbabwe | 2020 | 10.531 | 0.709 | NaN | 0.697 | 0.047 | 6.257 | 0.421 | ... | NaN | NaN | NaN | NaN | NaN | 14862927.0 | NaN | NaN | NaN | NaN |
25191 rows × 60 columns
df_co2['continent'] = df_co2['iso_code'].apply(lambda x: country_to_continent(x))
3. Descriptive statistics about the dataset¶
After the full segmentation and cleaning has been performed, let us try to understand our dataset a little bit better. We will afterwards jump into the Text Analytics part for finding important top business keywords, so it is important to know how large a scale does our analysis have to deal with.
df_co2
| iso_code | country | year | co2 | co2_per_capita | trade_co2 | cement_co2 | cement_co2_per_capita | coal_co2 | coal_co2_per_capita | ... | methane | methane_per_capita | nitrous_oxide | nitrous_oxide_per_capita | population | gdp | primary_energy_consumption | energy_per_capita | energy_per_gdp | continent | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | 1949 | 0.015 | 0.002 | NaN | NaN | NaN | 0.015 | 0.002 | ... | NaN | NaN | NaN | NaN | 7624058.0 | NaN | NaN | NaN | NaN | Asia |
| 1 | AFG | Afghanistan | 1950 | 0.084 | 0.011 | NaN | NaN | NaN | 0.021 | 0.003 | ... | NaN | NaN | NaN | NaN | 7752117.0 | 9.421400e+09 | NaN | NaN | NaN | Asia |
| 2 | AFG | Afghanistan | 1951 | 0.092 | 0.012 | NaN | NaN | NaN | 0.026 | 0.003 | ... | NaN | NaN | NaN | NaN | 7840151.0 | 9.692280e+09 | NaN | NaN | NaN | Asia |
| 3 | AFG | Afghanistan | 1952 | 0.092 | 0.012 | NaN | NaN | NaN | 0.032 | 0.004 | ... | NaN | NaN | NaN | NaN | 7935996.0 | 1.001732e+10 | NaN | NaN | NaN | Asia |
| 4 | AFG | Afghanistan | 1953 | 0.106 | 0.013 | NaN | NaN | NaN | 0.038 | 0.005 | ... | NaN | NaN | NaN | NaN | 8039684.0 | 1.063052e+10 | NaN | NaN | NaN | Asia |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 25186 | ZWE | Zimbabwe | 2016 | 10.738 | 0.765 | 1.415 | 0.639 | 0.046 | 6.959 | 0.496 | ... | 11.50 | 0.820 | 6.21 | 0.443 | 14030338.0 | 2.096179e+10 | 47.5 | 3385.574 | 1.889 | Africa |
| 25187 | ZWE | Zimbabwe | 2017 | 9.582 | 0.673 | 1.666 | 0.678 | 0.048 | 5.665 | 0.398 | ... | 11.62 | 0.816 | 6.35 | 0.446 | 14236599.0 | 2.194784e+10 | NaN | NaN | NaN | Africa |
| 25188 | ZWE | Zimbabwe | 2018 | 11.854 | 0.821 | 1.308 | 0.697 | 0.048 | 7.101 | 0.492 | ... | 11.96 | 0.828 | 6.59 | 0.456 | 14438812.0 | 2.271535e+10 | NaN | NaN | NaN | Africa |
| 25189 | ZWE | Zimbabwe | 2019 | 10.949 | 0.748 | 1.473 | 0.697 | 0.048 | 6.020 | 0.411 | ... | NaN | NaN | NaN | NaN | 14645473.0 | NaN | NaN | NaN | NaN | Africa |
| 25190 | ZWE | Zimbabwe | 2020 | 10.531 | 0.709 | NaN | 0.697 | 0.047 | 6.257 | 0.421 | ... | NaN | NaN | NaN | NaN | 14862927.0 | NaN | NaN | NaN | NaN | Africa |
25191 rows × 61 columns
5. What is Causing global warming?¶
5.1. Population growth¶
It is more than apparent that the world population is constantly growing, yet the resources on earth does not grow with it. Every life that comes to join the big party has demand for food, shelter, clothing, and due to advancement in technology and living standards, the list has now become extensive with Mcdonalds to michellin star, straw roof to golden roof, cheap fast fashion to luxury clothing, iphone 13 and its dupes, scooters to ferarris etc. All that demands is shouting production, production and production! Which inevitably increases fossil fuel burning and thereby CO2 emissions. It has been found that 1% of population growth mounts to 1.28% increase in average CO2 emission and thereby, results in global warming [6]. Let’s take a look how the world’s average CO2 emissions has changed with respect to population growth over the 30 years span from 1990 to 2020.
## fill
df_co2 = df_co2.fillna(0)
## select needed columns
df_pop = df_co2[['iso_code', 'country', 'year', 'co2', 'co2_per_capita','population','continent']]
## extract data for all countries from 1990 to 2020
df_pop_30 = df_pop[(df_pop['year'] > 1989) & (df_pop['year'] < 2021)]
world = df_pop_30[df_pop_30['country'] == 'World']
world['pop_change'] = world['population'].pct_change()*100
world['co2_change'] = world['co2'].pct_change()*100
df_world =pd.melt(world[['year','co2','population','co2_change','pop_change']],id_vars=['year'],var_name='CO2_pop', value_name='value')
df_world = df_world.fillna(0)
fig = go.Figure(
data=[
go.Scatter(name='CO2 emissions', x=sorted(list(set(df_world.year))), y=df_world[df_world['CO2_pop']=='co2'].value/1000, yaxis = 'y'),
go.Scatter(name='Population', x=sorted(list(set(df_world.year))), y=df_world[df_world['CO2_pop']=='population'].value, yaxis ='y2')
],
layout={
'yaxis': {'title': 'Average CO2 emissions (billion t)'},
'yaxis2': {'title': 'Population', 'overlaying': 'y', 'side': 'right'}
}
)
# Change the bar mode
fig.update_layout(barmode='group',title = 'Change in worlds annual CO2 emissions and total population')
fig.show()
It is quite apparent that both the world’s population and annual CO2 emissions have been increasing and the perentage change ratio between them has been greater than what was said to be 1:1.28 up until 2010. Around 2015,the increase of average CO2 emissions reached a plateu and in 2020 there is a rather significant decrease, meanwhile the population continues to grow, which is reflected in both plots. The reason for the slowed increase of CO2 emissions in 2015 is most likely due to more global efforts in tackling climate change, especially china for its tightened policies to deal with severe air pollution by burning less coal [7]. And in 2020, the pandemic put the world in lockdown which either halted or reduced almost all industrial activities, hence the significant drop in average CO2 emmissions.
years = [1990,1995,2000,2005,2010,2015,2020]
df_world = df_world[df_world['year'].isin(years)]
change_ratio = np.round(df_world[df_world['CO2_pop']=='co2_change']['value'].values/df_world[df_world['CO2_pop']=='pop_change']['value'].values,3)
fig = make_subplots(rows=1, cols=2)
fig = go.Figure(data=[
go.Bar(name='CO2 emissions', x=years, y=df_world[df_world['CO2_pop']=='co2_change'].value),
go.Bar(name='Population', x=years, y=df_world[df_world['CO2_pop']=='pop_change'].value),
go.Scatter(name = 'Change ratio', x = years, y = change_ratio, text = change_ratio ,textposition="top center",mode='lines+markers+text')
],
layout={
'yaxis': {'title': 'Percentage change (%)'}
})
# Change the bar mode
fig.update_layout(barmode='group',title='Percentage changes of annual CO2 emissions and population')
fig.show()
5.2. Energy consumption¶
With more people in the world and higher demand for living standards just mean that there are more mouth to feed and harder to suffice. We are consuming more and more energy as individuals. This naturally lead to higher global consumption for energy, hence we need to extract more oil, natural gas, burn more coal, and with great hope, turn more towards green energy sources like solar and wind.
df_energy = df_co2[['country', 'year', 'co2_per_unit_energy','primary_energy_consumption', 'energy_per_capita','cement_co2', 'coal_co2', 'flaring_co2', 'gas_co2', 'oil_co2',
'other_industry_co2', 'cement_co2_per_capita', 'coal_co2_per_capita', 'flaring_co2_per_capita', 'gas_co2_per_capita', 'oil_co2_per_capita', 'other_co2_per_capita']]
## extract data for all countries from 1749 to 2019
energy = df_energy[(df_energy['year'] > 1749) & (df_energy['year'] < 2021)]
world_energy = energy[energy['country'] == 'World']
fig = px.line(world_energy[world_energy['energy_per_capita']!=0],
x="year",
y="energy_per_capita",
title="World Energy consumption per capita (1965-2019)",
range_x=[1964, 2020],
range_y= [10000, 24000], markers=True)
fig.update_layout(
{
'yaxis': {'title': 'kilowatt-hours (kWh)'}
}
)
world_fuel = pd.melt(world_energy[['year','cement_co2', 'coal_co2', 'flaring_co2', 'gas_co2',
'oil_co2', 'other_industry_co2']],id_vars=['year'],var_name='CO2_energy', value_name='value')
## convert to billions tonne
world_fuel['value'] = world_fuel['value']/1000
fig = px.line(world_fuel[world_fuel['value']!= 0],
x="year",
y="value",
title="CO2 emissions by fuel type (1750-2019)",
color = 'CO2_energy',labels = {'CO2_energy':'CO2 by fuel type'})
fig.update_layout(hovermode='x unified', yaxis_title= 'CO2 emissions (billion t)')
fig.show()
The production of energy of course contributes to CO2 emissions, click the play buttom below to see how much each fuel/energy type contribute to world’s CO2 emissions.
## find the total co2 emission from each energy type
world_energy['total_co2'] = world_energy.iloc[:,5:11].sum(axis=1)
## calculate share of co2 emission of each energy type
fuel_prc = world_energy.iloc[:,5:11].div(world_energy.total_co2,axis=0)*100
fuel_prc['year'] = list(world_energy.year)
years = fuel_prc.year
trace1 = go.Scatter(x=years,
y=fuel_prc['coal_co2'],
name = 'Coal',
mode='lines',
line=dict(width=1.5),
visible=True)
trace2 = go.Scatter(x = years,
y = fuel_prc['oil_co2'],
name = 'Oil',
mode='lines',
line=dict(width=1.5),
visible=True)
trace3 = go.Scatter(x = years,
y = fuel_prc['gas_co2'],
name = 'Gas',
mode='lines',
line=dict(width=1.5),
visible=True)
trace4 = go.Scatter(x = years,
y = fuel_prc['cement_co2'],
name = 'Cement',
mode='lines',
line=dict(width=1.5),
visible=True)
trace5 = go.Scatter(x = years,
y = fuel_prc['flaring_co2'],
name = 'Flaring',
mode='lines',
line=dict(width=1.5),
visible=True)
trace6 = go.Scatter(x = years,
y = fuel_prc['other_industry_co2'],
name = 'Other industry',
mode='lines',
line=dict(width=1.5),
visible=True)
frames = [dict(data= [dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['coal_co2'][:k+1]),
dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['oil_co2'][:k+1]),
dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['gas_co2'][:k+1]),
dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['cement_co2'][:k+1]),
dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['flaring_co2'][:k+1]),
dict(type='scatter',
x=fuel_prc['year'][:k+1],
y=fuel_prc['other_industry_co2'][:k+1])
],
traces= [0, 1, 2, 3, 4, 5],
)for k in range(1, len(fuel_prc)-1)]
layout = go.Layout(hovermode='x unified', updatemenus=[dict(type="buttons", direction="right", x=0.9, y=1.16),
],
xaxis=dict(range=[1749,2020],
autorange=False, tickwidth=2,
title_text="Year"),
yaxis=dict(range=[-5, 105],
autorange=False,
title_text="%"),
title="Percentage share of CO2 emissions by energy/fuel type (1750-2019)",
)
fig = go.Figure(data=[trace1, trace2, trace3, trace4, trace5, trace6], frames=frames, layout=layout)
fig.update_layout(
updatemenus=[
dict(
buttons=list([
dict(label="Play",
method="animate",
args=[None,
dict(frame=dict(duration=5,
redraw=False),
transition=dict(duration=0),
fromcurrent=True,
mode='immediate')])
]))])
fig.show()
As we can see coal burning has dominated world’s energy generation from the early industrialisation years and still is the most used fuel type together with oil, thereby still the largest contributer to CO2 emissions. From early 1990s we start to see the rise of other energy sources, which includes renewable energy and there seems to be a steady growth.
5.3. Human acitivities and Wealth (GDP)¶
So we are using more energy than ever before, but for what? The answer is that we travel and transport more, we are manufacturing more goods, we construct more buildings, we further industrialise our productions.
Please play around with the pie chart to see how different kinds of human activity contribute a country, continent and the world’s CO2 emissions.
def countryname_to_continent(country_name):
try:
country_code = pc.country_name_to_country_alpha2(country_name, cn_name_format="default")
country_continent_code = pc.country_alpha2_to_continent_code(country_code)
country_continent_name = pc.convert_continent_code_to_continent_name(country_continent_code)
except:
country_continent_name = country_name
return country_continent_name
df_human = pd.read_csv('data/co2_humanactivities.csv',sep = ',',encoding = 'unicode_escape')
df_human['Continent'] = df_human['Country'].apply(lambda x: countryname_to_continent(x))
## filter out to keep only countries
df_human = df_human[df_human['Continent'].isin(['Africa','Asia','Europe','North America','South America'])]
fig = px.sunburst(df_human, path=['Continent', 'Country','Sector'], values='2018', color='Continent',width=1300, height=800)
fig.update_traces(textinfo="label+percent entry ")
fig.update_layout(
title = 'CO2 emissiosn by human activities',
)
fig.show()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
/var/folders/wv/j_lqm8yn4ndb1hvhthjcrb540000gn/T/ipykernel_19339/2666037783.py in <module>
9 return country_continent_name
10
---> 11 df_human = pd.read_csv('data/co2_humanactivities.csv',sep = ',',encoding = 'unicode_escape')
12 df_human['Continent'] = df_human['Country'].apply(lambda x: countryname_to_continent(x))
13 ## filter out to keep only countries
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
312
313 return wrapper
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
584 kwds.update(kwds_defaults)
585
--> 586 return _read(filepath_or_buffer, kwds)
587
588
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in _read(filepath_or_buffer, kwds)
480
481 # Create the parser.
--> 482 parser = TextFileReader(filepath_or_buffer, **kwds)
483
484 if chunksize or iterator:
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in __init__(self, f, engine, **kwds)
809 self.options["has_index_names"] = kwds["has_index_names"]
810
--> 811 self._engine = self._make_engine(self.engine)
812
813 def close(self):
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in _make_engine(self, engine)
1038 )
1039 # error: Too many arguments for "ParserBase"
-> 1040 return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
1041
1042 def _failover_to_python(self):
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py in __init__(self, src, **kwds)
49
50 # open handles
---> 51 self._open_handles(src, kwds)
52 assert self.handles is not None
53
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py in _open_handles(self, src, kwds)
220 Let the readers open IOHandles after they are done with their potential raises.
221 """
--> 222 self.handles = get_handle(
223 src,
224 "r",
/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
699 if ioargs.encoding and "b" not in ioargs.mode:
700 # Encoding
--> 701 handle = open(
702 handle,
703 ioargs.mode,
FileNotFoundError: [Errno 2] No such file or directory: 'data/co2_humanactivities.csv'
If we use more energy in order to meet higher demands of production also means that we will generate more wealth, hence growing the economy. It is shown that there are clear correlation between increasing wealth and CO2 emissions.
wealth = df_co2[['year', 'country','continent','co2', 'co2_per_capita','gdp','co2_per_gdp','population','share_global_co2']]
world_wealth = wealth[(wealth['country'] == 'World') & wealth['co2_per_gdp']]
world_wealth['co2'] = world_wealth['co2']/1000
fig = px.scatter(world_wealth,
x="gdp",
y="co2",
title="CO2 emissions vs. GDP (1965-2019)",trendline='ols')
fig.update_layout(yaxis_title= 'CO2 emissions (billion t)', xaxis_title= 'GDP ($)'
)
However, not everyone creates the same size of carbon footprint. We know that wealth is not distributed equally between us, some live in extreme poverty and some can get their hands on almost every resources possible. And this would also mean that we contribute differently to CO2 emissions. Let’s take a look at how CO2 emissions per person changes depending on how rich individuals are in a country in 2018.
## filter out non country rows
wealth_18 = wealth[(wealth['year'] == 2018) & (wealth['continent'].isin(['Asia','Europe','Africa','North America','Oceania','South America']))]
## calculate gdp per capita
wealth_18['gdp_per_capita'] = wealth_18['gdp']/wealth_18['population']
wealth_18['share_global_co2'] = wealth_18['share_global_co2']
## remove rows with missing values
wealth_18 = wealth_18[(wealth_18['gdp']!=0) & (wealth_18['share_global_co2']!=0)]
fig = px.scatter(wealth_18, x="gdp_per_capita", y="co2_per_capita",
size="population", color = 'continent',
hover_name="country", log_x=True, log_y=True, size_max=60,trendline='ols',trendline_scope="overall", title = 'CO2 per capita vs GDP per capita')
fig.update_layout(
xaxis_title="GDP per capita ($)",
yaxis_title="CO2 per capita (t)"
)
fig.show()
fig = px.treemap(wealth_18, path=[px.Constant("world"), 'continent', 'country'], values='co2_per_capita',
color='gdp_per_capita', hover_data=['co2'],
color_continuous_scale='RdBu',
color_continuous_midpoint=np.average(wealth_18['gdp_per_capita'], weights=wealth_18['co2_per_capita']))
fig.update_layout(margin = dict(t=50, l=25, r=25, b=25))
fig.show()
8. Discussion¶
References¶
9. Contributions¶
Website creation and setup: Hanlu
Part 1: Introduction and Motivation: ****
Part 2: Dataset segmentation and cleaning: ****
Part 3: Descriptive statistics about the dataset: ****
Part 4: Keyword detection using TF-IDF: ****
Part 5: Network Analysis upon the Yelp hotel dataset: ****
Part 6: Topic detection using LDA: ****
Part 7: Sentiment Analysis upon the hotel reviews: ****
Part 8: Discussion: ****